Linear algebra explained in four pages 
Excerpt from the NO BULLSHIT GUIDE TO LINEAR ALGEBRA by Ivan Savov 


Abstract—This document will review the fundamental ideas of linear algebra. 
We will learn about matrices, matrix operations, linear transformations and 
discuss both the theoretical and computational aspects of linear algebra. The 
tools of linear algebra open the gateway to the study of more advanced 
mathematics. A lot of knowledge buzz awaits you if you choose to follow the 
path of understanding, instead of trying to memorize a bunch of formulas. 


I. INTRODUCTION 


Linear algebra is the math of vectors and matrices. Let $n$ be a positive
integer and let $\mathbb{R}$ denote the set of real numbers; then $\mathbb{R}^n$
is the set of all $n$-tuples of real numbers. A vector $\vec{v} \in \mathbb{R}^n$
is an $n$-tuple of real numbers. The notation "$\in S$" is read "element of $S$."
For example, consider a vector that has three components:

\[
\vec{v} = (v_1, v_2, v_3) \in (\mathbb{R}, \mathbb{R}, \mathbb{R}) \equiv \mathbb{R}^3.
\]


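A matrix $A \in \mathbb{R}^{m \times n}$ is a rectangular array of real numbers
with $m$ rows and $n$ columns. For example, a $3 \times 2$ matrix looks like this:

\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\in \begin{bmatrix} \mathbb{R} & \mathbb{R} \\ \mathbb{R} & \mathbb{R} \\ \mathbb{R} & \mathbb{R} \end{bmatrix}
\equiv \mathbb{R}^{3 \times 2}.
\]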


The purpose of this document is to introduce you to the mathematical
operations that we can perform on vectors and matrices and to give you a
feel for the power of linear algebra. Many problems in science, business,
and technology can be described in terms of vectors and matrices, so it is
important that you understand how to work with these.


Prerequisites 


The only prerequisite for this tutorial is a basic understanding of high school
math concepts¹ like numbers, variables, equations, and the fundamental
arithmetic operations on real numbers: addition (denoted $+$), subtraction
(denoted $-$), multiplication (denoted implicitly), and division (fractions).
You should also be familiar with functions that take real numbers as
inputs and give real numbers as outputs, $f : \mathbb{R} \to \mathbb{R}$. Recall
that, by definition, the inverse function $f^{-1}$ undoes the effect of $f$. If
you are given $f(x)$ and you want to find $x$, you can use the inverse function
as follows: $f^{-1}(f(x)) = x$. For example, the function $f(x) = \ln(x)$ has the
inverse $f^{-1}(x) = e^x$, and the inverse of $g(x) = \sqrt{x}$ is $g^{-1}(x) = x^2$.


II. DEFINITIONS


A. Vector operations 


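We now define the math operations for vectors. The operations we can
perform on vectors $\vec{u} = (u_1, u_2, u_3)$ and $\vec{v} = (v_1, v_2, v_3)$
are: addition, subtraction, scaling, norm (length), dot product, and cross product:

\[
\begin{aligned}
\vec{u} + \vec{v} &= (u_1 + v_1,\; u_2 + v_2,\; u_3 + v_3) \\
\vec{u} - \vec{v} &= (u_1 - v_1,\; u_2 - v_2,\; u_3 - v_3) \\
\alpha\vec{u} &= (\alpha u_1,\; \alpha u_2,\; \alpha u_3) \\
\|\vec{u}\| &= \sqrt{u_1^2 + u_2^2 + u_3^2} \\
\vec{u} \cdot \vec{v} &= u_1 v_1 + u_2 v_2 + u_3 v_3 \\
\vec{u} \times \vec{v} &= (u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1)
\end{aligned}
\]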


The dot product and the cross product of two vectors can also be described
in terms of the angle $\theta$ between the two vectors. The formula for the dot
product of the vectors is $\vec{u} \cdot \vec{v} = \|\vec{u}\|\|\vec{v}\| \cos\theta$.
We say two vectors $\vec{u}$ and $\vec{v}$ are orthogonal if the angle between
them is $90^\circ$. The dot product of orthogonal vectors is zero:
$\vec{u} \cdot \vec{v} = \|\vec{u}\|\|\vec{v}\| \cos(90^\circ) = 0$.

The norm of the cross product is given by
$\|\vec{u} \times \vec{v}\| = \|\vec{u}\|\|\vec{v}\| \sin\theta$. The
cross product is not commutative: $\vec{u} \times \vec{v} \neq \vec{v} \times \vec{u}$;
in fact $\vec{u} \times \vec{v} = -\vec{v} \times \vec{u}$.
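
The vector operations above are easy to try out on a computer. Here is a minimal sketch in plain Python (standard library only); the function names and the example vectors are made up for illustration and are not part of any particular library.

```python
import math

def add(u, v):
    """Component-wise sum of two vectors."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(alpha, u):
    """Multiply every component of u by the number alpha."""
    return tuple(alpha * ui for ui in u)

def dot(u, v):
    """Dot product: sum of the products of matching components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Length of u, computed as the square root of u . u."""
    return math.sqrt(dot(u, u))

def cross(u, v):
    """Cross product of two 3-vectors."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

u = (1, 2, 3)   # hypothetical example vectors
v = (4, 5, 6)

print(add(u, v))                   # (5, 7, 9)
print(dot(u, v))                   # 32
print(cross(u, v))                 # (-3, 6, -3)
print(cross(v, u))                 # (3, -6, 3), the negative of cross(u, v)
print(norm((3, 4, 0)))             # 5.0
print(dot((1, 0, 0), (0, 1, 0)))   # 0, since the two vectors are orthogonal
```

Note that `cross(u, v)` is orthogonal to both `u` and `v`, which you can check with `dot`.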


¹ A good textbook to (re)learn high school math is minireference.com


B. Matrix operations 


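We denote by $A$ the matrix as a whole and refer to its entries as $a_{ij}$.
The mathematical operations defined for matrices are the following:

• addition (denoted $+$)
\[
C = A + B \quad\Leftrightarrow\quad c_{ij} = a_{ij} + b_{ij}.
\]
• subtraction (the inverse of addition)
• matrix product. The product of matrices $A \in \mathbb{R}^{m \times n}$ and
  $B \in \mathbb{R}^{n \times \ell}$ is another matrix $C \in \mathbb{R}^{m \times \ell}$
  given by the formula
\[
C = AB \quad\Leftrightarrow\quad c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj},
\]
\[
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
=
\begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \\
a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22}
\end{bmatrix}
\]
• matrix inverse (denoted $A^{-1}$)
• matrix transpose (denoted $^T$):
\[
\begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \end{bmatrix}^T
=
\begin{bmatrix} \alpha_1 & \beta_1 \\ \alpha_2 & \beta_2 \\ \alpha_3 & \beta_3 \end{bmatrix}
\]
• matrix trace: $\textrm{Tr}[A] \equiv \sum_{i=1}^{n} a_{ii}$
• determinant (denoted $\det(A)$ or $|A|$)

Note that the matrix product is not a commutative operation: $AB \neq BA$.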
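
To see the product formula and the failure of commutativity in action, here is a small sketch in plain Python. Matrices are stored as lists of rows; the helper names and the matrix `B` are chosen just for this illustration.

```python
def matmul(A, B):
    """Return C = AB, where c_ij = sum over k of a_ik * b_kj."""
    n = len(B)  # number of rows of B must equal the number of columns of A
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Swap the rows and the columns of A."""
    return [list(column) for column in zip(*A)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 9]]   # the 2 x 2 matrix used in the later sections
B = [[0, 1], [1, 0]]   # hypothetical second matrix

print(matmul(A, B))    # [[2, 1], [9, 3]]
print(matmul(B, A))    # [[3, 9], [1, 2]]
print(transpose(A))    # [[1, 3], [2, 9]]
print(trace(A))        # 10
```

The two products differ: multiplying by `B` on the right swaps the columns of `A`, while multiplying by `B` on the left swaps its rows.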


C. Matrix-vector product 


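The matrix-vector product is an important special case of the matrix-matrix
product. The product of a $3 \times 2$ matrix $A$ and the $2 \times 1$ column
vector $\vec{x}$ results in a $3 \times 1$ vector $\vec{y}$ given by:

\[
\vec{y} = A\vec{x}
\quad\Leftrightarrow\quad
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \\ a_{31}x_1 + a_{32}x_2 \end{bmatrix}
\]
\[
= x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix}
\tag{C}
\]
\[
= \begin{bmatrix}
(a_{11}, a_{12}) \cdot \vec{x} \\
(a_{21}, a_{22}) \cdot \vec{x} \\
(a_{31}, a_{32}) \cdot \vec{x}
\end{bmatrix}
\tag{R}
\]

There are two fundamentally different yet equivalent ways to interpret the
matrix-vector product. In the column picture, (C), the multiplication of the
matrix $A$ by the vector $\vec{x}$ produces a linear combination of the columns
of the matrix: $\vec{y} = A\vec{x} = x_1 A_{[:,1]} + x_2 A_{[:,2]}$, where
$A_{[:,1]}$ and $A_{[:,2]}$ are the first and second columns of the matrix $A$.

In the row picture, (R), multiplication of the matrix $A$ by the vector $\vec{x}$
produces a column vector with coefficients equal to the dot products of
rows of the matrix with the vector $\vec{x}$.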
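
The column picture and the row picture can be compared side by side in a short sketch. The code below is plain Python with made-up function names and a made-up 3 x 2 matrix; both functions should return the same output vector.

```python
def matvec_rows(A, x):
    """Row picture (R): each output entry is the dot product of a row with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matvec_columns(A, x):
    """Column picture (C): the output is x_1*(column 1) + x_2*(column 2) + ..."""
    y = [0] * len(A)
    for j, xj in enumerate(x):
        column = [row[j] for row in A]
        y = [yi + xj * c for yi, c in zip(y, column)]
    return y

A = [[1, 2],
     [3, 4],
     [5, 6]]   # hypothetical 3 x 2 matrix
x = [10, 1]    # hypothetical input vector

print(matvec_rows(A, x))      # [12, 34, 56]
print(matvec_columns(A, x))   # [12, 34, 56]
```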


D. Linear transformations 

The matrix-vector product is used to define the notion of a linear
transformation, which is one of the key notions in the study of linear
algebra. Multiplication by a matrix $A \in \mathbb{R}^{m \times n}$ can be
thought of as computing a linear transformation $T_A$ that takes $n$-vectors
as inputs and produces $m$-vectors as outputs:

\[
T_A : \mathbb{R}^n \to \mathbb{R}^m.
\]


For more info see the video of Prof. Strang’s MIT lecture: 


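Instead of writing $\vec{y} = T_A(\vec{x})$ for the linear transformation $T_A$
applied to the vector $\vec{x}$, we simply write $\vec{y} = A\vec{x}$. Applying
the linear transformation $T_A$ to the vector $\vec{x}$ corresponds to the
product of the matrix $A$ and the column vector $\vec{x}$. We say $T_A$ is
represented by the matrix $A$.

You can think of linear transformations as "vector functions" and describe
their properties in analogy with the regular functions you are familiar with:

\[
\begin{array}{rcl}
\text{function } f : \mathbb{R} \to \mathbb{R} & \Leftrightarrow & \text{linear transformation } T_A : \mathbb{R}^n \to \mathbb{R}^m \\
\text{input } x \in \mathbb{R} & \Leftrightarrow & \text{input } \vec{x} \in \mathbb{R}^n \\
\text{output } f(x) & \Leftrightarrow & \text{output } T_A(\vec{x}) = A\vec{x} \in \mathbb{R}^m \\
g \circ f = g(f(x)) & \Leftrightarrow & T_B(T_A(\vec{x})) = BA\vec{x} \\
\text{function inverse } f^{-1} & \Leftrightarrow & \text{matrix inverse } A^{-1} \\
\text{zeros of } f & \Leftrightarrow & \mathcal{N}(A) \equiv \text{null space of } A \\
\text{range of } f & \Leftrightarrow & \mathcal{C}(A) \equiv \text{column space of } A = \text{range of } T_A
\end{array}
\]

Note that the combined effect of applying the transformation $T_A$ followed
by $T_B$ on the input vector $\vec{x}$ is equivalent to the matrix product
$BA\vec{x}$.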
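
Here is a quick numerical check of this composition rule, written as a plain Python sketch. The matrix `B` and the input vector are hypothetical examples; the point is only that applying the two transformations one after the other gives the same result as multiplying once by the product matrix.

```python
def matvec(M, x):
    """Apply the linear transformation represented by M to the vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(B, A):
    """Return the matrix product BA (A acts first, then B)."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

A = [[1, 2], [3, 9]]           # T_A : R^2 -> R^2
B = [[1, 0], [0, 1], [1, 1]]   # hypothetical T_B : R^2 -> R^3
x = [1, 2]                     # hypothetical input vector

step_by_step = matvec(B, matvec(A, x))   # T_B(T_A(x))
all_at_once = matvec(matmul(B, A), x)    # (BA)x

print(step_by_step)   # [5, 21, 26]
print(all_at_once)    # [5, 21, 26]
```

Notice the order: the transformation applied first sits closest to the vector, which is why the product is written BA and not AB.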


E. Fundamental vector spaces 


A vector space consists of a set of vectors and all linear combinations of
these vectors. For example, the vector space
$S = \textrm{span}\{\vec{v}_1, \vec{v}_2\}$ consists of all vectors of the form
$\vec{v} = \alpha\vec{v}_1 + \beta\vec{v}_2$, where $\alpha$ and $\beta$ are
real numbers. We now define three fundamental vector spaces associated with a
matrix $A$.

The column space of a matrix $A$ is the set of vectors that can be produced
as linear combinations of the columns of the matrix $A$:

\[
\mathcal{C}(A) \equiv \{ \vec{y} \in \mathbb{R}^m \mid \vec{y} = A\vec{x} \text{ for some } \vec{x} \in \mathbb{R}^n \}.
\]

The column space is the range of the linear transformation $T_A$ (the set
of possible outputs). You can convince yourself of this fact by reviewing
the definition of the matrix-vector product in the column picture (C). The
vector $A\vec{x}$ contains $x_1$ times the 1st column of $A$, $x_2$ times the
2nd column of $A$, etc. Varying over all possible inputs $\vec{x}$, we obtain
all possible linear combinations of the columns of $A$, hence the name
"column space."

The null space $\mathcal{N}(A)$ of a matrix $A \in \mathbb{R}^{m \times n}$
consists of all the vectors that the matrix $A$ sends to the zero vector:

\[
\mathcal{N}(A) \equiv \{ \vec{x} \in \mathbb{R}^n \mid A\vec{x} = \vec{0} \}.
\]

The vectors in the null space are orthogonal to all the rows of the matrix.
We can see this from the row picture (R): the output vector is $\vec{0}$ if and
only if the input vector $\vec{x}$ is orthogonal to all the rows of $A$.

The row space of a matrix $A$, denoted $\mathcal{R}(A)$, is the set of linear
combinations of the rows of $A$. The row space $\mathcal{R}(A)$ is the
orthogonal complement of the null space $\mathcal{N}(A)$. This means that for
all vectors $\vec{v} \in \mathcal{R}(A)$ and all vectors
$\vec{w} \in \mathcal{N}(A)$, we have $\vec{v} \cdot \vec{w} = 0$. Together, the
null space and the row space form the domain of the transformation $T_A$,
$\mathbb{R}^n = \mathcal{N}(A) \oplus \mathcal{R}(A)$, where $\oplus$ stands for
orthogonal direct sum.


F. Matrix inverse 


By definition, the inverse matrix $A^{-1}$ undoes the effects of the matrix $A$.
The cumulative effect of applying $A^{-1}$ after $A$ is the identity matrix
$\mathbb{1}$:

\[
A^{-1}A = \mathbb{1}.
\]

The identity matrix (ones on the diagonal and zeros everywhere else)
corresponds to the identity transformation:
$T_{\mathbb{1}}(\vec{x}) = \mathbb{1}\vec{x} = \vec{x}$, for all $\vec{x}$.

The matrix inverse is useful for solving matrix equations. Whenever we
want to get rid of the matrix $A$ in some matrix equation, we can "hit" $A$
with its inverse $A^{-1}$ to make it disappear. For example, to solve for the
matrix $X$ in the equation $XA = B$, multiply both sides of the equation
by $A^{-1}$ from the right: $X = BA^{-1}$. To solve for $X$ in $ABCXD = E$,
multiply both sides of the equation by $D^{-1}$ on the right and by $A^{-1}$,
$B^{-1}$ and $C^{-1}$ (in that order) from the left:
$X = C^{-1}B^{-1}A^{-1}ED^{-1}$.
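
It may help to see the second example written out one step at a time. The derivation below assumes that A, B, C, and D are all invertible; each line multiplies both sides by one inverse, on the side where the matrix we want to remove is exposed.

\[
\begin{aligned}
ABCXD &= E \\
ABCX &= ED^{-1} && \text{(multiply by } D^{-1} \text{ on the right)} \\
BCX &= A^{-1}ED^{-1} && \text{(multiply by } A^{-1} \text{ on the left)} \\
CX &= B^{-1}A^{-1}ED^{-1} && \text{(multiply by } B^{-1} \text{ on the left)} \\
X &= C^{-1}B^{-1}A^{-1}ED^{-1} && \text{(multiply by } C^{-1} \text{ on the left)}
\end{aligned}
\]

Because the matrix product is not commutative, the side on which we multiply matters: we can only cancel a matrix that sits at the outer edge of the product.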


III. COMPUTATIONAL LINEAR ALGEBRA


Okay, I hear what you are saying: “Dude, enough with the theory talk, let’s
see some calculations.” In this section we’ll look at one of the fundamental
algorithms of linear algebra called Gauss-Jordan elimination.


A. Solving systems of equations 
Suppose we’re asked to solve the following system of equations:

\[
\begin{aligned}
1x_1 + 2x_2 &= 5, \\
3x_1 + 9x_2 &= 21.
\end{aligned}
\tag{1}
\]

Without a knowledge of linear algebra, we could use substitution, elimination,
or subtraction to find the values of the two unknowns $x_1$ and $x_2$.
Gauss-Jordan elimination is a systematic procedure for solving systems
of equations based on the following row operations:
α) Adding a multiple of one row to another row
β) Swapping two rows
γ) Multiplying a row by a nonzero constant
These row operations allow us to simplify the system of equations without
changing its solution.
To illustrate the Gauss-Jordan elimination procedure, we’ll now show the
sequence of row operations required to solve the system of linear equations
described above. We start by constructing an augmented matrix as follows:

\[
\left[\begin{array}{cc|c} 1 & 2 & 5 \\ 3 & 9 & 21 \end{array}\right].
\]

The first column in the augmented matrix corresponds to the coefficients of
the variable $x_1$, the second column corresponds to the coefficients of $x_2$,
and the third column contains the constants from the right-hand side.

The Gauss-Jordan elimination procedure consists of two phases. During 
the first phase, we proceed left-to-right by choosing a row with a leading 
one in the leftmost column (called a pivot) and systematically subtracting 
that row from all rows below it to get zeros below in the entire column. In 
the second phase, we start with the rightmost pivot and use it to eliminate 
all the numbers above it in the same column. Let’s see this in action. 


1) The first step is to use the pivot in the first column to eliminate the
   variable $x_1$ in the second row. We do this by subtracting three times
   the first row from the second row, denoted $R_2 \leftarrow R_2 - 3R_1$:

\[
\left[\begin{array}{cc|c} 1 & 2 & 5 \\ 0 & 3 & 6 \end{array}\right].
\]

2) Next, we create a pivot in the second row using $R_2 \leftarrow \frac{1}{3}R_2$:

\[
\left[\begin{array}{cc|c} 1 & 2 & 5 \\ 0 & 1 & 2 \end{array}\right].
\]

3) We now start the backward phase and eliminate the second variable
   from the first row. We do this by subtracting two times the second
   row from the first row, $R_1 \leftarrow R_1 - 2R_2$:

\[
\left[\begin{array}{cc|c} 1 & 0 & 1 \\ 0 & 1 & 2 \end{array}\right].
\]

The matrix is now in reduced row echelon form (RREF), which is the
“simplest” form it could be in. The solutions are: $x_1 = 1$, $x_2 = 2$.
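
The procedure can be sketched in a few lines of plain Python. This is one possible implementation, not the only one: the function name `rref` is chosen for this illustration, exact fractions are used to avoid rounding, and the entries above and below each pivot are cleared in a single sweep instead of in two separate phases (the final RREF is the same).

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    M = [[Fraction(entry) for entry in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(n_cols):
        # look for a row (at or below pivot_row) with a nonzero entry in this column
        candidates = [r for r in range(pivot_row, n_rows) if M[r][col] != 0]
        if not candidates:
            continue                                    # no pivot in this column
        r = candidates[0]
        M[pivot_row], M[r] = M[r], M[pivot_row]         # swap two rows
        pivot = M[pivot_row][col]
        M[pivot_row] = [entry / pivot for entry in M[pivot_row]]   # make the pivot 1
        for other in range(n_rows):
            if other != pivot_row and M[other][col] != 0:
                factor = M[other][col]
                # add a multiple of the pivot row to clear this entry
                M[other] = [a - factor * b for a, b in zip(M[other], M[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return M

augmented = [[1, 2, 5],
             [3, 9, 21]]
result = rref(augmented)
print([[str(entry) for entry in row] for row in result])
# [['1', '0', '1'], ['0', '1', '2']]
```

The last column of the result contains the solution, x1 = 1 and x2 = 2, in agreement with the calculation by hand.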


B. Systems of equations as matrix equations 


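We will now discuss another approach for solving the system of
equations. Using the definition of the matrix-vector product, we can express
this system of equations as a matrix equation:

\[
\begin{bmatrix} 1 & 2 \\ 3 & 9 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 5 \\ 21 \end{bmatrix}.
\]

This matrix equation has the form $A\vec{x} = \vec{b}$, where $A$ is a
$2 \times 2$ matrix, $\vec{x}$ is the vector of unknowns, and $\vec{b}$ is a
vector of constants. We can solve for $\vec{x}$ by multiplying both sides of the
equation by the matrix inverse $A^{-1}$:

\[
A^{-1}A\vec{x} = \mathbb{1}\vec{x}
= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= A^{-1}\vec{b}
= \begin{bmatrix} 3 & -\frac{2}{3} \\ -1 & \frac{1}{3} \end{bmatrix}
  \begin{bmatrix} 5 \\ 21 \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]

But how did we know what the inverse matrix $A^{-1}$ is?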


IV. COMPUTING THE INVERSE OF A MATRIX 


In this section we’ll look at several different approaches for computing 
the inverse of a matrix. The matrix inverse is unique so no matter which 
method we use to find the inverse, we’ll always obtain the same answer. 


A. Using row operations 


One approach for computing the inverse is to use the Gauss-Jordan
elimination procedure. Start by creating an array containing the entries
of the matrix $A$ on the left side and the identity matrix on the right side:

\[
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 3 & 9 & 0 & 1 \end{array}\right].
\]

Now we perform the Gauss-Jordan elimination procedure on this array.

1) The first row operation is to subtract three times the first row from
   the second row: $R_2 \leftarrow R_2 - 3R_1$. We obtain:

\[
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 3 & -3 & 1 \end{array}\right].
\]

2) The second row operation is to divide the second row by 3:
   $R_2 \leftarrow \frac{1}{3}R_2$:

\[
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 1 & -1 & \frac{1}{3} \end{array}\right].
\]

3) The third row operation is $R_1 \leftarrow R_1 - 2R_2$:

\[
\left[\begin{array}{cc|cc} 1 & 0 & 3 & -\frac{2}{3} \\ 0 & 1 & -1 & \frac{1}{3} \end{array}\right].
\]

The array is now in reduced row echelon form (RREF). The inverse matrix
appears on the right side of the array.

Observe that the sequence of row operations we used to solve the specific
system of equations $A\vec{x} = \vec{b}$ in the previous section is the same as
the sequence of row operations we used in this section to find the inverse
matrix. Indeed, in both cases the combined effect of the three row operations
is to “undo” the effects of $A$. The right side of the $2 \times 4$ array is
simply a convenient way to record this sequence of operations and thus obtain
$A^{-1}$.
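
The same bookkeeping can be automated. The sketch below is plain Python with a made-up function name `inverse`; it builds the array with A on the left and the identity on the right, runs Gauss-Jordan elimination on it, and reads off the right half. Exact fractions are used so the entries come out as 1/3 and not 0.333.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on the array [A | 1]."""
    n = len(A)
    # build the augmented array: A on the left, the identity matrix on the right
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # find a row with a nonzero entry in this column to use as the pivot
        pivot_row = next((i for i in range(col, n) if M[i][col] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is not invertible")
        M[col], M[pivot_row] = M[pivot_row], M[col]       # swap rows if needed
        pivot = M[col][col]
        M[col] = [entry / pivot for entry in M[col]]      # scale the pivot to 1
        for i in range(n):
            if i != col:
                factor = M[i][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[col])]
    return [row[n:] for row in M]                         # right half of the array

A_inv = inverse([[1, 2], [3, 9]])
print([[str(entry) for entry in row] for row in A_inv])
# [['3', '-2/3'], ['-1', '1/3']]
```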


B. Using elementary matrices 


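Every row operation we perform on a matrix is equivalent to a
left-multiplication by an elementary matrix. There are three types of
elementary matrices in correspondence with the three types of row operations:

\[
\begin{aligned}
R_\alpha &: R_1 \leftarrow R_1 + mR_2 &&\Leftrightarrow& E_\alpha &= \begin{bmatrix} 1 & m \\ 0 & 1 \end{bmatrix} \\
R_\beta &: R_1 \leftrightarrow R_2 &&\Leftrightarrow& E_\beta &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \\
R_\gamma &: R_1 \leftarrow mR_1 &&\Leftrightarrow& E_\gamma &= \begin{bmatrix} m & 0 \\ 0 & 1 \end{bmatrix}
\end{aligned}
\]

Let’s revisit the row operations we used to find $A^{-1}$ in the above section,
representing each row operation as an elementary matrix multiplication.

1) The first row operation $R_2 \leftarrow R_2 - 3R_1$ corresponds to a
   multiplication by the elementary matrix $E_1$:

\[
E_1 A = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}
        \begin{bmatrix} 1 & 2 \\ 3 & 9 \end{bmatrix}
      = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}.
\]

2) The second row operation $R_2 \leftarrow \frac{1}{3}R_2$ corresponds to a
   matrix $E_2$:

\[
E_2(E_1 A) = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{3} \end{bmatrix}
             \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}
           = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]

3) The final step, $R_1 \leftarrow R_1 - 2R_2$, corresponds to the matrix $E_3$:

\[
E_3(E_2 E_1 A) = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
                 \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
               = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]

Note that $E_3 E_2 E_1 A = \mathbb{1}$, so the product $E_3 E_2 E_1$ must be
equal to $A^{-1}$:

\[
A^{-1} = E_3 E_2 E_1
= \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{3} \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}
= \begin{bmatrix} 3 & -\frac{2}{3} \\ -1 & \frac{1}{3} \end{bmatrix}.
\]

The elementary matrix approach teaches us that every invertible matrix
can be decomposed as the product of elementary matrices. Since we know
$A^{-1} = E_3 E_2 E_1$, then
$A = (A^{-1})^{-1} = (E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1}$.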
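
If you want to double-check these products without doing the arithmetic by hand, a few lines of plain Python are enough. The helper `matmul` below is written for this illustration; the inverse elementary matrices are worked out by undoing each row operation (add back three times the first row, multiply by 3, add back two times the second row).

```python
from fractions import Fraction

def matmul(A, B):
    """Return the matrix product AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A  = [[1, 2], [3, 9]]
E1 = [[1, 0], [-3, 1]]                # R2 <- R2 - 3*R1
E2 = [[1, 0], [0, Fraction(1, 3)]]    # R2 <- (1/3)*R2
E3 = [[1, -2], [0, 1]]                # R1 <- R1 - 2*R2

A_inv = matmul(E3, matmul(E2, E1))
print([[str(entry) for entry in row] for row in A_inv])
# [['3', '-2/3'], ['-1', '1/3']]
print(matmul(A_inv, A) == [[1, 0], [0, 1]])   # True

# each elementary matrix is undone by the opposite row operation
E1_inv = [[1, 0], [3, 1]]             # R2 <- R2 + 3*R1
E2_inv = [[1, 0], [0, 3]]             # R2 <- 3*R2
E3_inv = [[1, 2], [0, 1]]             # R1 <- R1 + 2*R2
print(matmul(E1_inv, matmul(E2_inv, E3_inv)) == A)   # True
```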


C. Using a computer 


The last (and most practical) approach for finding the inverse of a matrix
is to use a computer algebra system like the one at live.sympy.org:

>>> from sympy import Matrix     # may be unnecessary if Matrix is preloaded
>>> A = Matrix( [[1,2],[3,9]] )  # define A
[1, 2]
[3, 9]

>>> A.inv()                      # calls the inv method on A
[ 3, -2/3]
[-1,  1/3]

You can use sympy to “check” your answers on homework problems.


V. OTHER TOPICS 


We’ll now discuss a number of other important topics of linear algebra. 


A. Basis 


Intuitively, a basis is any set of vectors that can be used as a coordinate
system for a vector space. You are certainly familiar with the standard basis
for the xy-plane that is made up of two orthogonal axes: the x-axis and
the y-axis. A vector $\vec{v}$ can be described as a coordinate pair
$(v_x, v_y)$ with respect to these axes, or equivalently as
$\vec{v} = v_x\hat{\imath} + v_y\hat{\jmath}$, where $\hat{\imath} \equiv (1,0)$
and $\hat{\jmath} \equiv (0,1)$ are unit vectors that point along the x-axis and
y-axis respectively. However, other coordinate systems are also possible.


Definition (Basis). A basis for an $n$-dimensional vector space $S$ is any set
of $n$ linearly independent vectors that are part of $S$.


Any set of two linearly independent vectors $\{\hat{e}_1, \hat{e}_2\}$ can serve
as a basis for $\mathbb{R}^2$. We can write any vector $\vec{v} \in \mathbb{R}^2$
as a linear combination of these basis vectors:
$\vec{v} = v_1\hat{e}_1 + v_2\hat{e}_2$.

Note the same vector $\vec{v}$ corresponds to different coordinate pairs
depending on the basis used: $\vec{v} = (v_x, v_y)$ in the standard basis
$B_s \equiv \{\hat{\imath}, \hat{\jmath}\}$, and $\vec{v} = (v_1, v_2)$ in the
basis $B_e \equiv \{\hat{e}_1, \hat{e}_2\}$. Therefore, it is important to keep
in mind the basis with respect to which the coefficients are taken, and if
necessary specify the basis as a subscript, e.g., $(v_x, v_y)_{B_s}$ or
$(v_1, v_2)_{B_e}$.

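Converting a coordinate vector from the basis $B_e$ to the basis $B_s$ is
performed as a multiplication by a change of basis matrix:

\[
\begin{bmatrix} v_x \\ v_y \end{bmatrix}_{B_s}
=
\underbrace{
\begin{bmatrix}
\hat{\imath} \cdot \hat{e}_1 & \hat{\imath} \cdot \hat{e}_2 \\
\hat{\jmath} \cdot \hat{e}_1 & \hat{\jmath} \cdot \hat{e}_2
\end{bmatrix}
}_{{}_{B_s}[\mathbb{1}]_{B_e}}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}_{B_e}.
\]

Note the change of basis matrix is actually an identity transformation. The
vector $\vec{v}$ remains unchanged; it is simply expressed with respect to a new
coordinate system. The change of basis from the $B_s$-basis to the $B_e$-basis
is accomplished using the inverse matrix:
${}_{B_e}[\mathbb{1}]_{B_s} = ({}_{B_s}[\mathbb{1}]_{B_e})^{-1}$.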


B. Matrix representations of linear transformations 


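Bases play an important role in the representation of linear transformations
$T : \mathbb{R}^n \to \mathbb{R}^m$. To fully describe the matrix that
corresponds to some linear transformation $T$, it is sufficient to know the
effect of $T$ on the $n$ vectors of the standard basis for the input space. For
a linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$, the matrix
representation corresponds to

\[
M_T = \begin{bmatrix}
T\!\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &
T\!\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)
\end{bmatrix} \in \mathbb{R}^{2 \times 2}.
\]

As a first example, consider the transformation $\Pi_x$ which projects vectors
onto the x-axis. For any vector $\vec{v} = (v_x, v_y)$, we have
$\Pi_x(\vec{v}) = (v_x, 0)$. The matrix representation of $\Pi_x$ is

\[
M_{\Pi_x} = \begin{bmatrix}
\Pi_x\!\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &
\Pi_x\!\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)
\end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.
\]

As a second example, let’s find the matrix representation of $R_\theta$, the
counterclockwise rotation by the angle $\theta$:

\[
M_{R_\theta} = \begin{bmatrix}
R_\theta\!\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &
R_\theta\!\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)
\end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.
\]

The first column of $M_{R_\theta}$ shows that $R_\theta$ maps the vector
$\hat{\imath} \equiv 1\angle 0$ to the vector
$1\angle\theta = (\cos\theta, \sin\theta)^T$. The second column shows that
$R_\theta$ maps the vector $\hat{\jmath} = 1\angle\frac{\pi}{2}$ to the vector
$1\angle(\frac{\pi}{2} + \theta) = (-\sin\theta, \cos\theta)^T$.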
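
Both examples can be tried out numerically. The sketch below is plain Python using only the `math` module; the function names and the test vectors are chosen for this illustration. Because of floating-point rounding, the rotated vectors come out very close to, but not exactly equal to, the ideal values.

```python
import math

def rotation_matrix(theta):
    """Matrix of the counterclockwise rotation by theta (in radians)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def matvec(M, v):
    """Apply the linear transformation represented by M to the vector v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

i_hat, j_hat = [1, 0], [0, 1]

M = rotation_matrix(math.pi / 2)    # rotate by 90 degrees
print(matvec(M, i_hat))             # approximately [0, 1]
print(matvec(M, j_hat))             # approximately [-1, 0]

projection_x = [[1, 0], [0, 0]]     # matrix of the projection onto the x-axis
print(matvec(projection_x, [3, 4])) # [3, 0]
```

The columns of each matrix are exactly the outputs for the inputs `i_hat` and `j_hat`, which is the recipe for building a matrix representation described above.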


C. Dimension and bases for vector spaces 


The dimension of a vector space is defined as the number of vectors
in a basis for that vector space. Consider the following vector space
$S = \textrm{span}\{(1,0,0), (0,1,0), (1,1,0)\}$. Seeing that the space is
described by three vectors, we might think that $S$ is 3-dimensional. This is
not the case, however, since the three vectors are not linearly independent so
they don’t form a basis for $S$. Two vectors are sufficient to describe any
vector in $S$; we can write $S = \textrm{span}\{(1,0,0), (0,1,0)\}$, and we see
these two vectors are linearly independent so they form a basis and
$\dim(S) = 2$.

There is a general procedure for finding a basis for a vector space.
Suppose you are given a description of a vector space in terms of $m$ vectors
$V = \textrm{span}\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_m\}$ and you are asked
to find a basis for $V$ and the dimension of $V$. To find a basis for $V$, you
must find a set of linearly independent vectors that span $V$. We can use the
Gauss-Jordan elimination procedure to accomplish this task. Write the vectors
$\vec{v}_i$ as the rows of a matrix $M$. The vector space $V$ corresponds to the
row space of the matrix $M$. Next, use row operations to find the reduced row
echelon form (RREF) of the matrix $M$. Since row operations do not change the
row space of the matrix, the row space of the reduced row echelon form of the
matrix $M$ is the same as the row space of the original set of vectors. The
nonzero rows in the RREF of the matrix form a basis for the vector space $V$ and
the number of nonzero rows is the dimension of $V$.
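
Here is how this procedure might look in plain Python for the three vectors that span S. The function name `row_space_basis` is made up for this sketch; it row-reduces the matrix whose rows are the given vectors (using exact fractions) and keeps the nonzero rows.

```python
from fractions import Fraction

def row_space_basis(vectors):
    """Row-reduce the matrix whose rows are `vectors`; return its nonzero rows."""
    M = [[Fraction(x) for x in v] for v in vectors]
    pivot_row = 0
    for col in range(len(M[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        r = next((i for i in range(pivot_row, len(M)) if M[i][col] != 0), None)
        if r is None:
            continue
        M[pivot_row], M[r] = M[r], M[pivot_row]
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]
        for other in range(len(M)):
            if other != pivot_row:
                factor = M[other][col]
                M[other] = [a - factor * b for a, b in zip(M[other], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return [row for row in M if any(x != 0 for x in row)]

basis = row_space_basis([(1, 0, 0), (0, 1, 0), (1, 1, 0)])
print([[int(x) for x in row] for row in basis])   # [[1, 0, 0], [0, 1, 0]]
print(len(basis))                                 # 2, the dimension of S
```

The third vector is reduced to a row of zeros because it is the sum of the first two, so it contributes nothing new to the span.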


D. Row space, column space, and rank of a matrix


Recall the fundamental vector spaces for matrices that we defined in
Section II-E: the column space $\mathcal{C}(A)$, the null space
$\mathcal{N}(A)$, and the row space $\mathcal{R}(A)$. A standard linear algebra
exam question is to give you a certain matrix $A$ and ask you to find the
dimension and a basis for each of its fundamental spaces.

In the previous section we described a procedure based on Gauss-Jordan
elimination which can be used to “distill” a set of linearly independent vectors
which form a basis for the row space $\mathcal{R}(A)$. We will now illustrate
this procedure with an example, and also show how to use the RREF of the
matrix $A$ to find bases for $\mathcal{C}(A)$ and $\mathcal{N}(A)$.

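Consider the following matrix and its reduced row echelon form:

\[
A = \begin{bmatrix} 1 & 3 & 3 & 3 \\ 2 & 6 & 7 & 6 \\ 3 & 9 & 9 & 10 \end{bmatrix},
\qquad
\textrm{rref}(A) = \begin{bmatrix} 1 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]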


The reduced row echelon form of the matrix $A$ contains three pivots. The
locations of the pivots will play an important role in the following steps.

The vectors $\{(1,3,0,0), (0,0,1,0), (0,0,0,1)\}$ form a basis for
$\mathcal{R}(A)$.

To find a basis for the column space $\mathcal{C}(A)$ of the matrix $A$ we need
to find which of the columns of $A$ are linearly independent. We can
do this by identifying the columns which contain the leading ones in
$\textrm{rref}(A)$. The corresponding columns in the original matrix form a
basis for the column space of $A$. Looking at $\textrm{rref}(A)$ we see the
first, third, and fourth columns of the matrix are linearly independent so the
vectors $\{(1,2,3)^T, (3,7,9)^T, (3,6,10)^T\}$ form a basis for $\mathcal{C}(A)$.

Now let’s find a basis for the null space,
$\mathcal{N}(A) \equiv \{\vec{x} \in \mathbb{R}^4 \mid A\vec{x} = \vec{0}\}$.
The second column does not contain a pivot, therefore it corresponds to a
free variable, which we will denote $s$. We are looking for a vector with three
unknowns and one free variable $(x_1, s, x_3, x_4)^T$ that obeys the conditions:

\[
\begin{bmatrix} 1 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ s \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\quad\Rightarrow\quad
\begin{aligned}
1x_1 + 3s &= 0 \\
1x_3 &= 0 \\
1x_4 &= 0
\end{aligned}
\]

Let’s express the unknowns $x_1$, $x_3$, and $x_4$ in terms of the free
variable $s$. We immediately see that $x_3 = 0$ and $x_4 = 0$, and we can write
$x_1 = -3s$. Therefore, any vector of the form $(-3s, s, 0, 0)$, for any
$s \in \mathbb{R}$, is in the null space of $A$. We write
$\mathcal{N}(A) = \textrm{span}\{(-3, 1, 0, 0)^T\}$.

Observe that $\dim(\mathcal{C}(A)) = \dim(\mathcal{R}(A)) = 3$; this is known as
the rank of the matrix $A$. Also,
$\dim(\mathcal{R}(A)) + \dim(\mathcal{N}(A)) = 3 + 1 = 4$,
which is the dimension of the input space of the linear transformation $T_A$.
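
These claims are easy to verify numerically. The plain Python sketch below takes the matrix and its RREF as given in the text (it does not recompute the RREF) and checks that the null-space vector is sent to zero, then reads the pivot columns off the RREF. Variable names are chosen for this illustration, and column indices are counted from 0 as usual in Python.

```python
def matvec(M, x):
    """Multiply the matrix M (a list of rows) by the vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[1, 3, 3, 3],
     [2, 6, 7, 6],
     [3, 9, 9, 10]]
rref_A = [[1, 3, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1]]
n = [-3, 1, 0, 0]                    # claimed basis vector for the null space

print(matvec(A, n))                  # [0, 0, 0]
print(matvec(rref_A, n))             # [0, 0, 0], n is orthogonal to every row

# position of the leading one in each (nonzero) row of the RREF
pivot_columns = [row.index(1) for row in rref_A]
print(pivot_columns)                 # [0, 2, 3]: first, third, and fourth columns

# the matching columns of the original matrix form a basis for the column space
column_basis = [[row[j] for row in A] for j in pivot_columns]
print(column_basis)                  # [[1, 2, 3], [3, 7, 9], [3, 6, 10]]

rank = len(pivot_columns)
nullity = len(A[0]) - rank           # number of columns without a pivot
print(rank, nullity)                 # 3 1
```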


E. Invertible matrix theorem 


There is an important distinction between matrices that are invertible and 
those that are not as formalized by the following theorem. 


Theorem. For an $n \times n$ matrix $A$, the following statements are equivalent:
1) $A$ is invertible
2) The RREF of $A$ is the $n \times n$ identity matrix
3) The rank of the matrix is $n$
4) The row space of $A$ is $\mathbb{R}^n$
5) The column space of $A$ is $\mathbb{R}^n$
6) The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\vec{0}\}$
7) The determinant of $A$ is nonzero, $\det(A) \neq 0$


For a given matrix A, the above statements are either all true or all false. 

An invertible matrix $A$ corresponds to a linear transformation $T_A$ which
maps the $n$-dimensional input vector space $\mathbb{R}^n$ to the
$n$-dimensional output vector space $\mathbb{R}^n$ such that there exists an
inverse transformation $T_A^{-1}$ that can faithfully undo the effects of $T_A$.

On the other hand, an $n \times n$ matrix $B$ that is not invertible maps the
input vector space $\mathbb{R}^n$ to a subspace
$\mathcal{C}(B) \subsetneq \mathbb{R}^n$ and has a null space that contains
nonzero vectors. Once $T_B$ sends a nonzero vector $\vec{w} \in \mathcal{N}(B)$
to the zero vector, there is no $T_B^{-1}$ that can undo this operation.


F. Determinants 

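The determinant of a matrix, denoted $\det(A)$ or $|A|$, is a special way to
combine the entries of a matrix that serves to check if a matrix is invertible
or not. The determinant formulas for $2 \times 2$ and $3 \times 3$ matrices are

\[
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}
= a_{11}a_{22} - a_{12}a_{21}, \quad \text{and}
\]
\[
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.
\]

If $|A| = 0$ then $A$ is not invertible. If $|A| \neq 0$ then $A$ is invertible.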
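
The two formulas translate directly into code. The sketch below is plain Python; the function names `det2` and `det3` and the test matrices (other than the matrix A from the earlier sections) are made up for this illustration.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix."""
    (a11, a12), (a21, a22) = M
    return a11 * a22 - a12 * a21

def det3(M):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = M
    return (a11 * det2([[a22, a23], [a32, a33]])
            - a12 * det2([[a21, a23], [a31, a33]])
            + a13 * det2([[a21, a22], [a31, a32]]))

print(det2([[1, 2], [3, 9]]))    # 3, nonzero, so this matrix is invertible
print(det2([[1, 2], [2, 4]]))    # 0, the second row is twice the first
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 24
print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))   # 0, so not invertible
```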


G. Eigenvalues and eigenvectors 


The set of eigenvectors of a matrix is a special set of input vectors for
which the action of the matrix is described as a simple scaling. When
a matrix is multiplied by one of its eigenvectors the output is the same
eigenvector multiplied by a constant:
$A\vec{e}_\lambda = \lambda\vec{e}_\lambda$. The constant $\lambda$ is called
an eigenvalue of $A$.

To find the eigenvalues of a matrix we start from the eigenvalue equation
$A\vec{e}_\lambda = \lambda\vec{e}_\lambda$, insert the identity $\mathbb{1}$,
and rewrite it as a null-space problem:

\[
A\vec{e}_\lambda = \lambda\mathbb{1}\vec{e}_\lambda
\quad\Rightarrow\quad
(A - \lambda\mathbb{1})\,\vec{e}_\lambda = \vec{0}.
\]

This equation will have a nonzero solution whenever
$|A - \lambda\mathbb{1}| = 0$. The eigenvalues of
$A \in \mathbb{R}^{n \times n}$, denoted
$\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$, are the roots of the
characteristic polynomial $p(\lambda) = |A - \lambda\mathbb{1}|$. The
eigenvectors associated with the eigenvalue $\lambda_i$ are the vectors in the
null space of the matrix $(A - \lambda_i\mathbb{1})$.

Certain matrices can be written entirely in terms of their eigenvectors
and their eigenvalues. Consider the matrix $\Lambda$ that has the eigenvalues of
the matrix $A$ on the diagonal, and the matrix $Q$ constructed from the
eigenvectors of $A$ as columns:

\[
\Lambda = \begin{bmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & 0 \\ 0 & 0 & \lambda_n \end{bmatrix},
\quad
Q = \begin{bmatrix} | & & | \\ \vec{e}_{\lambda_1} & \cdots & \vec{e}_{\lambda_n} \\ | & & | \end{bmatrix},
\quad \text{then} \quad
A = Q \Lambda Q^{-1}.
\]

Matrices that can be written this way are called diagonalizable.

The decomposition of a matrix into its eigenvalues and eigenvectors 
gives valuable insights into the properties of the matrix. Google’s original 
PageRank algorithm for ranking webpages by “importance” can be 
formalized as an eigenvector calculation on the matrix of web hyperlinks. 
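
To make these ideas concrete, here is a small plain Python sketch for the 2 x 2 case, where the characteristic polynomial is a quadratic whose roots can be found with the quadratic formula. Everything in it is an assumption made for illustration: the function names, the example matrix, and the eigenvectors (1, 1) and (1, -1), which were found by hand. The sketch only handles real eigenvalues.

```python
import math

def eigenvalues_2x2(M):
    """Roots of p(lam) = |M - lam*1| = lam^2 - Tr[M]*lam + det(M), if real."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((tr + root) / 2, (tr - root) / 2)

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 1], [1, 2]]                 # hypothetical example matrix
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)                    # 3.0 1.0

e1, e2 = [1, 1], [1, -1]             # eigenvectors for lam1 and lam2
print(matvec(A, e1))                 # [3, 3], which is 3 times e1
print(matvec(A, e2))                 # [1, -1], which is 1 times e2

Q = [[1, 1], [1, -1]]                # eigenvectors as columns
Lambda = [[lam1, 0], [0, lam2]]      # eigenvalues on the diagonal
Q_inv = [[0.5, 0.5], [0.5, -0.5]]    # inverse of Q, computed by hand
print(matmul(matmul(Q, Lambda), Q_inv))   # [[2.0, 1.0], [1.0, 2.0]]
```

For larger matrices the roots of the characteristic polynomial are rarely found by hand, and a computer algebra system or a numerical library is the practical choice.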


VI. TEXTBOOK PLUG 


If you’re interested in learning more about linear algebra, you can check 
out my new book, the NO BULLSHIT GUIDE TO LINEAR ALGEBRA. 


A pre-release version of the book is available here: 


